A Rule-Based Tagger for Polish Based on Genetic Algorithm

Authors

  • Maciej Piasecki
  • Bartlomiej Gawel
Abstract

This paper proposes an approach to constructing a rule-based morphosyntactic tagger for Polish. The core of the tagger consists of modules of rules (classification systems) acquired from the IPI PAN corpus by applying Genetic Algorithms. Each module specialises in decisions concerning a different part of a tag (a structure of attributes). The acquired rules are combined with hand-written linguistic rules and with memory-based rules that are also acquired from the corpus. The paper also presents the construction of the tagger and experiments examining its properties.

Similar resources

Fault Detection of Bearings Using a Rule-based Classifier Ensemble and Genetic Algorithm

This paper proposes a reduct construction method based on discernibility matrix simplification. The method works with a genetic algorithm. To identify potential problems and prevent complete failure of bearings, a new method based on a rule-based classifier ensemble is presented. A genetic algorithm is used for feature reduction. The generated rules of the reducts are used to build the candidate base...

Optimisation of Polish Tagger Parameters

The large tagset of the IPI PAN Corpus of Polish enforced a modular architecture for the Polish tagger called TaKIPI. The architecture introduces several parameters, for learning and tagging, that are difficult to adjust properly by hand. This paper presents a method for optimising the parameter values based on a Genetic Algorithm. A chromosome is a set of values, a specimen is a ...

Adaptive Rule-Base Influence Function Mechanism for Cultural Algorithm

This study proposes a modified version of cultural algorithms (CAs) that uses a rule-based system for the influence function. This rule-based system selects and applies the suitable knowledge source according to the distribution of the solutions. It is important to apply an appropriate influence function to a specific individual, according to its role in the search process. This rule ...

Multiclassifier Approach to Tagging of Polish

The large tagset, the limited size of corpora and the free word order are the main reasons why the commonly used techniques based on stochastic modelling achieve low accuracy in tagging Polish. The proposed architecture of the Polish tagger called TaKIPI made it possible to use different types of classifiers in tagging, but only C4.5 Decision Trees were applied initially....

POLISH TAGGER TaKIPI: RULE BASED CONSTRUCTION AND OPTIMISATION

A large number of different tags, limited corpora and the free word order are the main causes of the low accuracy of tagging in Polish (automatic disambiguation of morphological descriptions) when commonly used techniques based on stochastic modelling are applied. The paper presents the rule-based architecture of the TaKIPI Polish tagger, which combines handwritten and automatically extracted rules. Th...

Publication date: 2005